Complete Genome and Methylome Analysis of the Box-Shaped Halophilic Archaeon Haloarcula sinaiiensis ATCC 33800

ABSTRACT The genome of the halophilic archaeon Haloarcula sinaiiensis ATCC 33800 was sequenced and assembled and comprises seven replicons. Four m6A motifs and one m4C modified motif, together with their responsible methyltransferase genes, were identified in the genome by single-molecule real-time (SMRT) sequencing and bioinformatic analysis.

Haloarcula sinaiiensis ATCC 33800 is a halophilic archaeon isolated from brine samples collected at the Red Sea Sabkha Gavish (1). The cells are flat and angular, including irregular rectangles, triangles, and squares. H. sinaiiensis is extremely halophilic, requiring more than 2 M NaCl for growth and 3 to 4 M NaCl for optimal growth (2–4).
A Pacific Biosciences RS II instrument and the SMRTAnalysis pipeline were used for sequencing, contig assembly, and modification analysis as described previously (5). The H. sinaiiensis ATCC 33800 strain was a gift from ATCC. Briefly, cells were grown in 500 ml of ATCC 1230 Haloarcula medium with trace elements (3.50 mg/ml FeSO4·7H2O, 0.88 mg/ml ZnSO4·7H2O, 0.66 mg/ml MnSO4·H2O, and 0.02 mg/ml CuSO4·5H2O [pH 7.4]) diluted 2,000-fold at 37°C for 36 h and then centrifuged to recover cells. Pellets were resuspended in 100 ml of a 25% NaCl salt solution and lysed by osmotic shock by addition of a 10 mM EDTA solution until visual clearing occurred. High-molecular-weight DNA was then harvested: the archaeal lysate was extracted three times with an equal volume of phenol saturated with Tris-HCl (pH 8), followed by dialysis and treatment with RNase (6).
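The working concentrations implied by the 2,000-fold dilution can be checked with a short calculation. This is a minimal sketch, assuming the concentrations listed in parentheses are those of the trace element stock and that the dilution applies to that stock; the function and variable names are illustrative only.

```python
# Stock concentrations (mg/ml) as listed in the text; treating them as the
# undiluted trace element stock is an assumption made for this sketch.
STOCK_MG_PER_ML = {
    "FeSO4·7H2O": 3.50,
    "ZnSO4·7H2O": 0.88,
    "MnSO4·H2O": 0.66,
    "CuSO4·5H2O": 0.02,
}

DILUTION_FOLD = 2000


def final_ug_per_ml(stock_mg_per_ml, fold=DILUTION_FOLD):
    """Convert a stock concentration in mg/ml to the diluted value in ug/ml."""
    return stock_mg_per_ml * 1000.0 / fold


if __name__ == "__main__":
    for salt, stock in STOCK_MG_PER_ML.items():
        print(f"{salt}: {final_ug_per_ml(stock):.2f} ug/ml")
```

Under these assumptions the diluted concentrations fall in the range of roughly 0.01 to 1.75 µg/ml.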
A total of 10 μg of genomic DNA (gDNA) was used to prepare single-molecule real-time (SMRT) libraries. Briefly, SMRTbell libraries were constructed from a genomic DNA sample sheared to ~10 to 20 kb using the G-tubes protocol (Covaris, Woburn, MA, USA), end repaired (NEBNext end repair, E6050; New England BioLabs [NEB], Ipswich, MA, USA), and ligated (Quick Ligation kit, M2200S; NEB) to PacBio hairpin adapters. Incompletely formed SMRTbell templates and linear DNAs were digested with a combination of exonuclease III and exonuclease VII (NEB). DNA qualification and quantification were performed using the Qubit fluorimeter (Invitrogen, Eugene, OR) and 2100 Bioanalyzer (Agilent Technology, Santa Clara, CA).
Three SMRTbell libraries (6, 10, and 13 kb) were prepared according to the PacBio sample preparation protocol (https://www.pacb.com/wp-content/uploads/2015/09/Guide-Pacific-Biosciences-Template-Preparation-and-Sequencing.pdf), including additional separation on a BluePippin system (Sage Science, Beverly, MA), and sequenced with C4-P6 chemistry using 9 SMRT cells, as follows: 5 with non-size-selected (6- and 10-kb) libraries and 4 with size-selected 13-kb libraries, with a 360-minute collection time for each library. A total of 3.048 Gb of sequencing data in 274,386 polymerase reads, with a mean subread length of 4,980 bp and an N50 subread length of 7,635 bp, were obtained (~426× coverage) and de novo assembled using HGAP_Assembly.3 version 2.3.0 with default quality and read length parameters, and the assembly was polished three times using Quiver (7). The polished assemblies generated seven closed circular contigs (Table 1). The assembled sequences were annotated using the NCBI Prokaryotic Genomes Annotation Pipeline (PGAP) (8, 9). Previously, the H. sinaiiensis ATCC 33800 genome sequence was deposited in GenBank as 84 contigs based on short-read sequencing under accession number AOLR01000000.1.
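For readers less familiar with the read statistics quoted above, the following minimal sketch shows how an N50 length and a fold-coverage value are conventionally computed. It is an illustration only and not the SMRTAnalysis implementation; the function names and the toy numbers are assumptions chosen so the arithmetic is easy to follow.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    The N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one value")


def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy values, not real reads
    print(n50(toy_lengths))
    print(fold_coverage(sum(toy_lengths), 6))
```

In the toy example the lengths sum to 54, so the N50 is the length at which the running total of the longest reads first reaches 27 bases.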
Four m6A motifs and one m4C modified DNA motif were detected using SMRT motif and modification analysis version 2.3.0 (10–12). Scanning the genome using the Seqware program (13) predicted nine methyltransferase (MTase) genes. All five motifs were matched to their corresponding MTase genes based on homology (14) to previously known MTases. The results are presented in Table 1 and have been deposited in REBASE (15).
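Once a modified motif is known, its occurrences in an assembled replicon can be located on both strands with a simple scan. The sketch below is an assumption-laden illustration and not the SMRT analysis software or Seqware: the function names are hypothetical, and the motif in the usage example (GANTC) is a placeholder chosen to show degenerate IUPAC codes, not one of the five motifs reported in Table 1.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence or motif, IUPAC codes included."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(sequence, motif):
    """Return sorted (start, strand) pairs for every motif occurrence.

    Starts are 0-based coordinates on the forward strand. Minus-strand
    hits are found by searching for the reverse-complemented motif.
    A zero-width lookahead is used so overlapping sites are reported.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    hits = set()
    for strand, query in (("+", motif), ("-", reverse_complement(motif))):
        pattern = re.compile("(?=" + "".join(IUPAC[b] for b in query) + ")")
        for match in pattern.finditer(sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    # Placeholder motif and toy sequence, for illustration only.
    for start, strand in find_motif("TTGAATCAAGACTCTT", "GANTC"):
        print(start, strand)
```

Because the placeholder motif is palindromic, each site is reported once per strand; a non-palindromic motif yields distinct plus- and minus-strand positions.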

ACKNOWLEDGMENTS
A.F. and R.J.R. work for New England BioLabs, a company that sells research reagents, including restriction enzymes and DNA methyltransferases, to the scientific community.
The DasSarma laboratory was supported by NASA grant 80NSSC17K0263 and NIH grant AI139808.